System and method for state estimation in a noisy machine-learning environment

ABSTRACT

A system and method for estimating a system state. The method includes making a first measurement and a second measurement of a value of a characteristic of a state of a system. The method includes constructing first filter measurement and time estimates after the second measurement coinciding with the first measurement including corresponding covariance matrices describing an accuracy of the first filter measurement and time estimates. The method includes constructing second filter measurement and time estimates coinciding with the second measurement including corresponding covariance matrices describing an accuracy of the second filter measurement and time estimates. The method includes constructing a smoothing estimate from the first and second filter measurement estimates. The method includes constructing a first prediction estimate that provides a forecast of a value of the characteristic of the state of the system including a first prediction covariance matrix describing an accuracy of the first prediction estimate.

CROSS-REFERENCE TO RELATED APPLICATION

This application is a Continuation of U.S. patent application Ser. No. 16/674,848 entitled “System And Method For State Estimation In A Noisy Machine-Learning Environment” filed on Nov. 5, 2019 which claims benefit to U.S. Provisional Patent Application No. 62/756,044, entitled “Hybrid AI,” filed Nov. 5, 2018, which is incorporated herein by reference.

This application is related to U.S. application Ser. No. 15/611,476 entitled “PREDICTIVE AND PRESCRIPTIVE ANALYTICS FOR SYSTEMS UNDER VARIABLE OPERATIONS,” filed Jun. 1, 2017, which is incorporated herein by reference. (INC-026)

This application is related to U.S. Provisional Application No. 62/627,644 entitled “DIGITAL TWINS, PAIRS, AND PLURALITIES,” filed Feb. 7, 2018, converted to U.S. application Ser. No. 16/270,338 entitled “SYSTEM AND METHOD THAT CHARACTERIZES AN OBJECT EMPLOYING VIRTUAL REPRESENTATIONS THEREOF,” filed Feb. 7, 2019, which are incorporated herein by reference. (INC-030)

This application is related to U.S. application Ser. No. 16/674,885 (Attorney Docket No. INC-031B), entitled “SYSTEM AND METHOD FOR ADAPTIVE OPTIMIZATION,” filed Nov. 5, 2019, U.S. application Ser. No. 16/674,942 (Attorney Docket No. INC-031C), entitled “SYSTEM AND METHOD FOR CONSTRUCTING A MATHEMATICAL MODEL OF A SYSTEM IN AN ARTIFICIAL INTELLIGENCE ENVIRONMENT,” filed Nov. 5, 2019, and U.S. application Ser. No. 16/675,000 (Attorney Docket No. INC-031D), entitled “SYSTEM AND METHOD FOR VIGOROUS ARTIFICIAL INTELLIGENCE,” filed Nov. 5, 2019, which are incorporated herein by reference.

RELATED REFERENCES

Each of the references cited below is incorporated herein by reference.

U.S. Patents:

| Pat. No. | Issue Date | Patentee |
|---|---|---|
| 10,068,170 | Sep. 4, 2018 | Golovashkin, et al. |

U.S. Patent Application Publications:

| Publication Number | Kind Code | Publication Date | Applicant |
|---|---|---|---|
| 20190036639 | A1 | Jan. 31, 2019 | Huang; Yan; et al. |

Nonpatent Literature Documents:

-   Makridakis, S. et al., “Statistical and Machine Learning Forecasting Methods: Concerns and Ways Forward” (2018)
-   Box, G. E. P. et al., “Time Series Analysis: Forecasting and Control” (2016)
-   Hyndman, R. J. and Athanasopoulos, G., “Forecasting: Principles and Practice” (2014)
-   Brown, R. G. and Hwang, P. Y. C., “Introduction to Random Signals and Applied Kalman Filtering with MATLAB Exercises” (2012)
-   Grewal, M. S. and Andrews, A. P., “Kalman Filtering: Theory and Practice Using MATLAB” (2008)
-   Jategaonkar, R., “Flight Vehicle System Identification: A Time Domain Methodology” (2006)
-   Simon, D., “Optimal State Estimation: Kalman, H-infinity, and Nonlinear Approaches” (2006)
-   Zarchan, P., “Fundamentals of Kalman Filtering: A Practical Approach” (2005)
-   Desai, U. B. et al., “Discrete-Time Complementary Models and Smoothing Algorithms: The Correlated Noise Case” (1983)
-   Van Loan, C. F., “Computing Integrals Involving the Matrix Exponential” (1978)
-   Gelb, A., “Applied Optimal Estimation” (1974)
-   Fraser, D. C. and Potter, J. E., “The Optimum Linear Smoother as a Combination of Two Linear Filters” (1969)
-   Meditch, J. S., “Stochastic Optimal Linear Estimation and Control” (1969)
-   Rauch, H. E. et al., “Maximum Likelihood Estimates of Linear Dynamic Systems” (1965)
-   Kalman, R. E., “A New Approach to Linear Filtering and Prediction Problems” (1960)

TECHNICAL FIELD

The present disclosure is directed, in general, to state tracking systems and, more specifically, to a system and method for estimating the state of a system in a noisy measurement environment.

BACKGROUND

Niels Bohr is often quoted as saying, “Prediction is very difficult, especially about the future.” Prediction is the practice of extracting information from sets of data to identify patterns and predict future behavior. The Institute for Operations Research and the Management Sciences (INFORMS) defines several types of prediction which have been expanded here for clarity:

-   Descriptive Analytics, in which enormous amounts of historical (big) data are used for describing patterns extracted solely from the data.
-   Predictive Analytics, where, in addition to descriptive analytics, subject matter expert (SME) knowledge is included to capture attributes not reflected in the big data by itself.
-   Decision Analytics, where, in addition to predictive analytics, influence diagrams (mathematical networks or models) are architected to address decision-making strategy.
-   Prescriptive Analytics, where, in addition to decision analytics, advanced mathematical techniques (e.g., optimization) are leveraged to predict missing data. The techniques of prescriptive analytics include data modeling/mining, artificial intelligence (AI), supervised or unsupervised machine learning, and deep learning. The focus of the present disclosure is with machine and deep learning with the intent to train a network or characterize a model for prediction about the future as new data is assimilated.

Table 1 (below) provides a sampling of the state of the art of machine learning methods, where each method has its own set of implementation requirements; it was presented by Anais Dotis-Georgiou at the Big Data and Artificial Intelligence Conference in 2019 at Addison, TX.

TABLE 1

| Regression | Classification | Soft Clustering | Hard Clustering |
|---|---|---|---|
| Ensemble methods | Decision trees | Fuzzy-C means | Hierarchical clustering |
| Gaussian process | Discriminant analysis | Gaussian mixture | K-means |
| General linear model | K-nearest neighbor | | K-medoids |
| Linear regression | Logistic regression | | Self-organizing maps |
| Nonlinear regression | Naïve Bayes | | |
| Regression tree | Neural nets | | |
| Support vector machine | Support vector machine | | |

In the case of supervised learning, the designer is required to manually select features, choose the classifier method, and tune the hyperparameters. In the case of unsupervised learning, some algorithms (e.g., k-means, k-medoid, and fuzzy c-means) require the number of clusters to be selected a priori; principal component analysis requires the data to be scaled, assumes the data is orthogonal, and results in linear correlation; nonnegative matrix factorization requires normalization of the data; and factor analysis is subject to interpretation.

Deep learning brings with it its own set of demands. Enormous computing power through high performance graphics processing units (GPUs) is needed to process big data, on the order of 10⁵ to 10⁶ points. Also, the data must be numerically tagged. Furthermore, it takes a long time to train a model. In the end, because of the depth of complexity, it is virtually impossible to understand how conclusions were reached.

The artificial neural network (ANN) architecture supporting machine/deep learning is supposedly inspired by the biologic nervous system. The model learns through a process called back propagation which is an iterative gradient method to reduce the error between the input and output data. But humans do not back-propagate when learning, so the analogy is weak in that regard. Other drawbacks include the following:

-   ANNs are shallow. While little innate knowledge is required on behalf of the practitioner, architectural selection becomes an exercise in numerical investigation.
-   Arbitrary numbers of hidden layers and nodes comprise the depth of deep learning. The practitioner is often unable to explain why one architecture is used over another, not to mention the practitioner has no control or influence over what is being learned.
-   ANNs are greedy and brittle. Big data (and big computing) is required to train/test the model which often breaks when presented with new data.
-   ANNs are opaque. There is generally a lack of transparency due to the difficulty in understanding the connection between inputs and outputs, especially for deep learning. This unknown opaqueness leaves the practitioner wondering if the architecture can be trusted.

FIG. 1 shows that machine learning forecasting performs worse than statistical methods, as measured on the vertical axis by the symmetric mean absolute percentage error of each method. In the study depicted in FIG. 1, exponential smoothing (Error-Trend-Seasonality, “ETS”) is superior to all other methods. Noticeably absent from the field of comparison are optimal estimators, e.g., linear quadratic estimators, commonly called Kalman filters. It is believed that the prior art approaches have not included applying optimal estimation techniques to predictive or prescriptive analytics, let alone combining all three techniques: filtering, smoothing, and predicting. This is reasonable since the Kalman filter (without smoothing or predicting), and its variants, is typically used for state estimation in many control system applications.
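For readers unfamiliar with the metric on the vertical axis of FIG. 1, the following is a minimal sketch of a symmetric mean absolute percentage error calculation. The disclosure does not state which variant of the formula underlies FIG. 1; the definition below (twice the absolute error divided by the sum of the absolute actual and forecast values, averaged and expressed in percent) is a common one and is an assumption here, as is the function name `smape`.

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent.

    Assumed definition: mean of 2*|F - A| / (|A| + |F|), times 100.
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = abs(a) + abs(f)
        # Assumed convention: a term whose denominator is zero contributes 0.
        if denom > 0.0:
            total += 2.0 * abs(f - a) / denom
    return 100.0 * total / len(actual)
```

A lower value indicates a forecast that tracks the observed data more closely, which is the sense in which the methods of FIG. 1 are ranked.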

A system is needed which improves upon machine learning and statistical methods such that the practitioner can perform real-time predictive and prescriptive analytics. The system should avoid the pitfalls of artificial neural networks with their arbitrary hidden layers, iterative feature and method selection, and hyperparameter tuning. Furthermore, the system should not require enormous computing power. Preferably, such a system will overcome state estimation challenges in a noisy measurement environment.

SUMMARY

Deficiencies of the prior art are generally solved or avoided, and technical advantages are generally achieved, by advantageous embodiments of the present disclosure of a system and method for estimating a state of a system. The method includes making a first measurement of a value of a characteristic of a state of a system, and a second measurement of a value of the characteristic of the state of the system after the first measurement. While the second measurement is the last measurement in this example, there may be a plurality of measurements followed by the following steps that apply to the corresponding measurements. After the measurements have been taken, the method carries out a filtering process. The method also includes constructing a first filter measurement estimate after the second measurement coinciding with the first measurement including a first filter measurement covariance matrix describing an accuracy of the first filter measurement estimate, and constructing a first filter time estimate after the first filter measurement estimate including a first filter time covariance matrix describing an accuracy of the first filter time estimate employing a dynamic model of the state of the system. The method also includes constructing a second filter measurement estimate after the first filter time estimate coinciding with the second measurement including a second filter measurement covariance matrix describing an accuracy of the second filter measurement estimate, and constructing a second filter time estimate after the second filter measurement estimate including a second filter time covariance matrix describing an accuracy of the second filter time estimate employing the dynamic model of the state of the system.

After the filtering process, the method carries out a smoothing process. The method includes constructing a smoothing estimate from the first filter measurement estimate and the second filter measurement estimate. The smoothing estimate may be obtained by sweeping backward recursively from the second filter measurement estimate to the first filter measurement estimate. After the smoothing process, the method carries out a prediction. The method includes constructing a first prediction estimate after the smoothing estimate that provides a forecast of a value of the characteristic of the state of the system including a first prediction covariance matrix describing an accuracy of the first prediction estimate employing the dynamic model of the state of the system. Of course, the method can carry out a plurality of prediction estimates providing corresponding forecasts of a value of the characteristic of the state of the system.

The foregoing has outlined rather broadly the features and technical advantages of the present disclosure in order that the detailed description of the disclosure that follows may be better understood. Additional features and advantages of the disclosed embodiments will be described hereinafter, which form the subject matter of the claims. It should be appreciated by those skilled in the art that the conception and specific embodiment disclosed may be readily utilized as a basis for modifying or designing other structures or processes for carrying out the same purposes of the present disclosure, and that such equivalent constructions do not depart from the spirit and scope of the disclosure as set forth in the appended claims.

BRIEF DESCRIPTION OF THE DRAWINGS

For a more complete understanding of the present disclosure, reference is now made to the following detailed description taken in conjunction with the accompanying drawings, in which:

FIG. 1 illustrates a graphical representation comparing forecasting performance (Symmetric Mean Absolute Percentage Error) of machine learning and statistical methods for known machine learning processes;

FIG. 2 illustrates a process for smoothing over fixed time intervals;

FIG. 3 illustrates a graphical representation of filtering, smoothing, and prediction performance for analytics;

FIGS. 4A and 4B illustrate performance examples of temperature tracking and vibration filtering in a noisy measurement environment;

FIG. 5 illustrates a flow diagram of an embodiment of a method of estimating the state of a system; and,

FIG. 6 illustrates a block diagram of an embodiment of an apparatus for estimating the state of a system.

Corresponding numerals and symbols in the different figures generally refer to corresponding parts unless otherwise indicated and, in the interest of brevity, may not be described after the first instance.

DETAILED DESCRIPTION

The making and using of exemplary embodiments of the disclosed invention are discussed in detail below. It should be appreciated, however, that the general embodiments are provided to illustrate the inventive concepts that can be embodied in a wide variety of specific contexts, and the specific embodiments are merely illustrative of specific ways to make and use the systems, subsystems, and modules for estimating the state of a system in a real-time, noisy measurement, machine-learning environment. While the principles will be described in the environment of a linear system in a real-time machine-learning environment, any environment such as a nonlinear system, or a non-real-time machine-learning environment, is within the broad scope of the disclosed principles and claims.

Intelligent prediction is a system introduced herein that uniquely combines the three forms of optimal estimation (filtering, smoothing, and predicting) to provide utility for predictive and prescriptive analytics with applications to real-time sensor data. The results of this systematic approach outperform the best of the current statistical methods and, as such, outperform machine learning methods.

To perform predictive and prescriptive analytics in real-time and avoid the pitfalls of machine learning which utilizes artificial neural networks, the system architecture introduced herein is based on combining three types of estimation: filtering, smoothing, and predicting, which is an approach new to forecasting, and which it is believed has not been previously considered with regard to machine learning, as illustrated in Table 1.

FIG. 1 , to which reference is now made, illustrates on the vertical axis a symmetric mean absolute percentage error for forecasting performance of machine learning and statistical methods for known machine learning processes. The first part of the combined three-part system is an implementation of a discrete-time Kalman filter. The filtering form of optimal estimation is when an estimate coincides with the last measurement point.

With the following definitions:

-   $x_k$ = ($n \times 1$), state vector at time $t_k$
-   $\phi_k$ = ($n \times n$), state transition matrix
-   $w_k$ = ($n \times 1$), process white noise, $w_k \sim (0, Q_k)$
-   $Q_k$ is the covariance of the process noise
-   $z_k$ = ($m \times 1$), measurement vector at time $t_k$
-   $H_k$ = ($m \times n$), measurement matrix
-   $v_k$ = ($m \times 1$), measurement white noise, $v_k \sim (0, R_k)$
-   $R_k$ is the covariance of the measurement noise,

the dynamic process is described by

$$x_{k+1} = \phi_k x_k + w_k,$$

measurements are described by

$$z_k = H_k x_k + v_k,$$

and initial conditions given by

$$\hat{x}_0^- = E[x_0]$$

$$P_0^- = E\left[(x_0 - \hat{x}_0^-)(x_0 - \hat{x}_0^-)^T\right].$$

The discrete-time Kalman filter recursive equations are given by

$$K_k = P_k^- H_k^T \left(H_k P_k^- H_k^T + R_k\right)^{-1} \qquad \text{Filtering gain}$$

$$\hat{x}_k = \hat{x}_k^- + K_k \left(z_k - H_k \hat{x}_k^-\right) \qquad \text{State measurement estimate}$$

$$P_k = (I - K_k H_k) P_k^- (I - K_k H_k)^T + K_k R_k K_k^T \qquad \text{State measurement covariance}$$

$$\hat{x}_{k+1}^- = \phi_k \hat{x}_k \qquad \text{State time estimate at a next time step } t_{k+1}$$

$$P_{k+1}^- = \phi_k P_k \phi_k^T + Q_k \qquad \text{State time covariance (also at next time step)}$$
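The five recursive equations can be sketched in a few lines of NumPy. This is an illustrative sketch, not the implementation of the disclosed embodiments: the function names `kalman_step` and `run_filter` are hypothetical, vectors are represented as one-dimensional arrays, and the matrices $\phi$, $H$, $Q$, and $R$ are assumed constant over the run for brevity.

```python
import numpy as np

def kalman_step(x_prior, P_prior, z, Phi, H, Q, R):
    """One pass through the five filter equations.

    Returns the a posteriori estimate and covariance at t_k and the
    a priori estimate and covariance at t_{k+1}.
    """
    n = x_prior.shape[0]
    # Filtering gain
    K = P_prior @ H.T @ np.linalg.inv(H @ P_prior @ H.T + R)
    # State measurement estimate
    x_post = x_prior + K @ (z - H @ x_prior)
    # State measurement covariance (the symmetric form given in the text)
    A = np.eye(n) - K @ H
    P_post = A @ P_prior @ A.T + K @ R @ K.T
    # State time estimate and state time covariance at the next time step
    x_next = Phi @ x_post
    P_next = Phi @ P_post @ Phi.T + Q
    return x_post, P_post, x_next, P_next


def run_filter(zs, x0, P0, Phi, H, Q, R):
    """Filter a sequence of measurements, saving what a later smoother needs."""
    x_prior, P_prior = x0, P0
    x_posts, P_posts, P_priors = [], [], []
    for z in zs:
        x_post, P_post, x_next, P_next = kalman_step(
            x_prior, P_prior, z, Phi, H, Q, R)
        x_posts.append(x_post)
        P_posts.append(P_post)
        P_priors.append(P_prior)
        x_prior, P_prior = x_next, P_next
    return x_posts, P_posts, P_priors
```

Saving the a priori covariances alongside the a posteriori estimates and covariances is a design choice made here so that a backward smoothing sweep can reuse them without recomputation.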

It is worth mentioning that C. F. Van Loan's method is employed to compute $\phi_k$ and $Q_k$. As previously mentioned, the Kalman filter has numerous applications for guidance, navigation, and control of aerospace vehicles, e.g., aircraft, spacecraft, rockets, and missiles. However, the filter will be combined, as introduced herein, with smoothing and predicting with applications to (possibly real-time) predictive and prescriptive analytics.
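As a hedged illustration of how Van Loan's method can yield the state transition matrix and the discrete process noise covariance, the sketch below assumes a continuous-time model $\dot{x} = Fx + Gw$ with process noise spectral density $Q_c$ and a sample interval `dt`. The symbols `F`, `G`, `Qc`, and `dt`, and both function names, are assumptions introduced for this example and do not appear in the disclosure. The matrix exponential is computed with a simple scaled Taylor series to keep the example free of dependencies beyond NumPy; a library routine would normally be preferred.

```python
import numpy as np

def expm_taylor(A, terms=20):
    """Matrix exponential by scaling and squaring with a truncated Taylor
    series. Illustrative helper only, not a production-grade routine."""
    norm = np.linalg.norm(A, 1)
    s = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    A_scaled = A / (2.0 ** s)
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for i in range(1, terms + 1):
        term = term @ A_scaled / i
        result = result + term
    # Undo the scaling by repeated squaring.
    for _ in range(s):
        result = result @ result
    return result


def van_loan(F, G, Qc, dt):
    """Discretize dx/dt = F x + G w over a step dt; returns (Phi, Qd)."""
    n = F.shape[0]
    # Block matrix [[-F, G Qc G^T], [0, F^T]] scaled by dt.
    M = np.zeros((2 * n, 2 * n))
    M[:n, :n] = -F
    M[:n, n:] = G @ Qc @ G.T
    M[n:, n:] = F.T
    B = expm_taylor(M * dt)
    # Lower-right block is Phi^T; upper-right block is Phi^{-1} Qd.
    Phi = B[n:, n:].T
    Qd = Phi @ B[:n, n:]
    return Phi, Qd
```

For a constant-velocity model (position and velocity states driven by white acceleration noise), this reproduces the familiar closed-form transition matrix and process noise covariance.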

The second part of the three-part system is an implementation of discrete fixed-interval smoothing. The smoothing form of optimal estimation is when an estimate falls within a span of measurement points. For the proposed system, the time interval of the measurements is fixed (hence the name) and optimal estimates of the (saved) states $\hat{x}_k$ are obtained.

With initial conditions given by the last a posteriori estimate and covariance from the filter

$$x_T^s = \hat{x}_T$$

$$P_T^s = P_T,$$

the smoother sweeps backward recursively

$$C_k = P_k \phi_k^T \left(P_{k+1}^-\right)^{-1} \qquad \text{Smoothing gain}$$

$$x_k^s = \hat{x}_k + C_k \left(x_{k+1}^s - \phi_k \hat{x}_k\right) \qquad \text{State smoothing estimate}$$

$$P_k^s = P_k + C_k \left(P_{k+1}^s - P_{k+1}^-\right) C_k^T \qquad \text{State smoothing covariance}$$
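The backward sweep can be sketched as follows, assuming the filter has saved its a posteriori estimates and covariances together with its a priori covariances. The function name `rts_smoother` and the list-based storage are hypothetical, and a constant state transition matrix is assumed for brevity.

```python
import numpy as np

def rts_smoother(x_posts, P_posts, P_priors, Phi):
    """Backward sweep over saved filter results.

    x_posts[k], P_posts[k]: a posteriori estimate and covariance at t_k.
    P_priors[k]: a priori covariance at t_k.
    Phi: state transition matrix, assumed constant in this sketch.
    """
    N = len(x_posts)
    x_s = [None] * N
    P_s = [None] * N
    # Initial conditions: last a posteriori estimate and covariance.
    x_s[-1], P_s[-1] = x_posts[-1], P_posts[-1]
    for k in range(N - 2, -1, -1):
        # Smoothing gain
        C = P_posts[k] @ Phi.T @ np.linalg.inv(P_priors[k + 1])
        # State smoothing estimate
        x_s[k] = x_posts[k] + C @ (x_s[k + 1] - Phi @ x_posts[k])
        # State smoothing covariance
        P_s[k] = P_posts[k] + C @ (P_s[k + 1] - P_priors[k + 1]) @ C.T
    return x_s, P_s
```

Because each smoothed estimate at an earlier time draws on measurements taken after it, the smoothed covariance is typically no larger than the corresponding filtered covariance.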

The third part of the three-part system is an implementation of a predictor. The predicting form of optimal estimation is when an estimate falls beyond the last measurement point. The equations for the predictor are identical to the filter with the following three exceptions:

-   (1) The initial conditions are given by the last a posteriori estimate and covariance of the smoother:

$$\hat{x}_0^- = x_T^s$$

$$P_0^- = P_T^s,$$

-   (2) The covariance of the measurement noise $R_k$ is set to a large value rendering the measurements worthless because there are not any measurements available with prediction.
-   (3) As such, $z_k$ is fixed to the value of the last measurement.

The predictor propagates forward for the forecast period of interest. These predictions may be at various points in the future over various time periods. For example, one analytics application might predict temperature one minute into the future and/or five minutes into the future. Also, there may be temperature predictions at hourly or daily rates. These example combinations would be dependent on the application, of course.
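The predictor described above can be sketched by running the same recursion with a very large measurement noise covariance and a held measurement. With $R_k$ large, the filtering gain is nearly zero, so the measurement update leaves the estimate and covariance essentially unchanged and only the time propagation has an effect. The function name `predict`, the default value of `r_large`, and the choice of a scaled identity matrix for $R_k$ are assumptions made for illustration.

```python
import numpy as np

def predict(x_init, P_init, z_last, Phi, H, Q, steps, r_large=1e12):
    """Run the filter recursion forward with no new measurements.

    x_init, P_init: initial conditions taken from the smoother.
    z_last: last available measurement, held fixed.
    r_large: hypothetical "large value" for the measurement noise covariance.
    """
    n = x_init.shape[0]
    m = H.shape[0]
    R = r_large * np.eye(m)
    x, P = x_init, P_init
    forecasts, covariances = [], []
    for _ in range(steps):
        # Measurement update; with R this large the gain is nearly zero.
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
        x = x + K @ (z_last - H @ x)
        A = np.eye(n) - K @ H
        P = A @ P @ A.T + K @ R @ K.T
        # Time update: propagate to the next forecast point.
        x = Phi @ x
        P = Phi @ P @ Phi.T + Q
        forecasts.append(x)
        covariances.append(P)
    return forecasts, covariances
```

The returned covariances grow with the forecast horizon, which gives a direct measure of how much confidence to place in each forecast point.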

Those skilled in the art know how to model dynamic process and measurements described by $x_{k+1} = \phi_k x_k + w_k$ and $z_k = H_k x_k + v_k$, respectively. Thus, once initialized with $\hat{x}_0^-$ and $P_0^-$, the five-step Kalman filter iterates recursively until the set of data to be filtered is exhausted resulting in state estimates $\hat{x}_k$ and state covariances $P_k$.

Upon saving the state estimates $\hat{x}_k$, the state covariances $P_k$, and properly initializing $x_T^s$ and $P_T^s$ with the last entries of $\hat{x}_k(T_f)$ and $P_k(T_f)$, the three-step smoother iterates recursively with a backward sweep to an earlier time point, as illustrated in FIG. 2, showing fixed-interval smoothing, until all states/covariances are consumed. At this point, the system is prepared for predictive and prescriptive analytics.

The last state estimate and state covariance of the smoother is used to initialize the predictor. The predictor runs just like the five-step Kalman filter with two exceptions: (i) the covariance of the measurement noise $R_k$ is set to an arbitrarily large value to indicate the measurements are worthless, because there are not any, and (ii) the measurement $z_k$ is fixed to its final value, because that is the last piece of information available.

An application of the disclosed three-part system's implementation is shown in FIG. 3 illustrating an example of filtering, smoothing, and predicting for analytics where the future position of a moving object is predicted. In FIG. 3 , the actual position of an object is depicted on the vertical axis with a dashed line, the filtered position is represented by dots, the smoothed position is denoted by a solid line, and the predicted position is shown with “x” marking the spots. Measurements are filtered and smoothed up until 50 seconds, at which time, predictions are made 30 seconds into the future.

Referring again to FIG. 1 , if performance of a three-part system is better than ETS, then its performance is better than the rest. In this section, details are presented showing filtering with prediction is better than ETS. When combined with smoothing, performance improves and forms the basis of the accurate prediction shown in FIG. 3 .

Data used to perform the analysis was gathered from a live, operating Horizontal Pump System (HPS). Filtering and predicting was tested with eighteen data sets. Measurements were taken every hour so that one forecasting period represents one hour of elapsed time. The data measures various components of the HPS including bearing and winding temperatures in the motor, pump vibration and suction pressure, overall system health, and other attributes of the system.

Each data set includes noise which may vary with time. The different sources of data provide a mixture of different characteristics such as seasonality, trends, impulses, and randomness. For example, temperature data is affected by the day/night (diurnal) cycle which creates a (short) seasonal characteristic. Vibration data, however, is not affected by the day/night cycle and is not seasonal but does contain a significant portion of randomness.

A missed prediction occurs when an observed measurement exceeds a threshold value, but no forecast was produced which predicted the exception. Any forecast which predicted the exception within twelve periods leading up to the exception was not considered because such a short forecast is not useful. A prediction strategy should produce as few missed predictions as possible.
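The missed-prediction metric can be sketched as follows. The disclosure does not specify how forecasts are stored, so the representation of each predicted exception as an `(issued_at, target)` pair of period indices is hypothetical, as is the function name. The sketch also assumes that a forecast counts only if it was issued more than twelve periods before the exception it predicts, which is one reading of the exclusion rule stated above.

```python
def missed_prediction_rate(observed, threshold, predicted_exceptions, min_lead=12):
    """Percent of observed threshold exceptions that no usable forecast predicted.

    observed: list of measurements indexed by period.
    predicted_exceptions: list of (issued_at, target) period pairs, each meaning
        a forecast issued at period issued_at predicted an exception at period
        target (hypothetical representation).
    min_lead: forecasts issued min_lead or fewer periods before the exception
        are not considered (assumed reading of the twelve-period rule).
    """
    exceptions = [t for t, value in enumerate(observed) if value > threshold]
    if not exceptions:
        return 0.0
    missed = 0
    for t in exceptions:
        caught = any(
            target == t and (t - issued_at) > min_lead
            for issued_at, target in predicted_exceptions
        )
        if not caught:
            missed += 1
    return 100.0 * missed / len(exceptions)
```

For example, an exception at period 20 forecast at period 5 counts as predicted, whereas an exception at period 25 forecast only at period 20 counts as missed because the lead time is too short to be useful.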

Turning now to Table 2 (below), illustrated are results for temperature and vibration data after filtering and predicting, showing sensitivities, forecast lengths, and average percent of missed predictions.

TABLE 2

| Strategy | Sensitivity | Forecast Length, Periods (Days) | Average % Missed Predictions, Low Noise | Average % Missed Predictions, High Noise |
|---|---|---|---|---|
| ETS | High | 24 (1) | 18.92% | 43.19% |
| Filter/Predictor | High | 24 (1) | 0.00% | 0.00% |
| ETS | Medium | 24 (1) | 16.67% | 42.94% |
| Filter/Predictor | Medium | 24 (1) | 0.00% | 0.00% |
| ETS | Low | 24 (1) | 61.17% | 62.05% |
| Filter/Predictor | Low | 24 (1) | 24.96% | 13.89% |
| ETS | High | 336 (14) | 0.00% | 5.56% |
| Filter/Predictor | High | 336 (14) | 0.00% | 0.00% |
| ETS | Medium | 336 (14) | 0.00% | 2.78% |
| Filter/Predictor | Medium | 336 (14) | 0.00% | 0.00% |
| ETS | Low | 336 (14) | 0.00% | 5.56% |
| Filter/Predictor | Low | 336 (14) | 0.00% | 0.00% |

Table 2 compares each strategy (ETS versus filter/prediction) over 24 periods (1 day) and 336 periods (14 days). The filter sensitivity column refers to how closely the signal is being tracked. For instance, temperature changes slowly over time so the filter/prediction combination is set to high sensitivity to track the slowly changing signal; whereas vibration, which contains high-frequency noise, is set to low sensitivity, as illustrated in FIGS. 4A and 4B, which show performance examples of temperature tracking and vibration filtering in a noisy measurement environment. The Average % Missed Predictions column in Table 2 shows the probability that, given the data has crossed a critical threshold, the corresponding strategy failed to predict the event. A lower percentage indicates better performance for this metric. In each scenario (low noise/high noise), the filter/predictor strategy with a high or medium sensitivity correctly predicted every event. Even in the case of low sensitivity, the filter/predictor strategy outperformed the ETS strategy.

The filtering, smoothing, and predicting process introduced herein outperforms the ETS strategy, which was the basis of performance assessment over machine learning strategies as shown in FIG. 1. These results appear to be independent of sensitivity setting (low, medium, or high). Therefore, in general, a practitioner could use the filter/predictor strategy to avoid missed predictions. Furthermore, with the inclusion of smoothing, these results are improved upon as shown in FIG. 3.

Turning now to FIG. 5, illustrated is a flow diagram of an embodiment of a method 500 of estimating a state of a system. The method 500 may be employable to estimate a state of a system in a machine learning and/or noisy measurement environment. The method 500 is operable on a processor such as a microprocessor coupled to a memory, the memory containing instructions which, when executed by the processor, are operative to perform the functions. The method 500 begins at a start step or module 505.

At a step or module 510, a first estimate of a state of a system is constructed at a first time including a first covariance matrix describing an accuracy of the first estimate.

At a step or module 520, a second estimate of the state of said system is constructed at a second time, after the first time, including a second covariance matrix describing an accuracy of the second estimate employing a dynamic model of the state of the system; the dynamic model comprises a matrix with coefficients that describes a temporal evolution of the state of the system.

At a step or module 530, a value of a characteristic of the state of the system is measured at the second time. Measuring the value of the characteristic can include making a plurality of independent measurements characterized by a diagonal measurement covariance matrix. At a step or module 540, the second estimate of the state of the system and the second covariance matrix are adjusted based on the value of the characteristic.

At a step or module 550, a third estimate of the state of the system is constructed at a third time, before the second time, including a third covariance matrix describing an accuracy of the third estimate employing the dynamic model of the state of the system.

At a step or module 560, a fourth estimate of the state of the system is constructed at a fourth time, after the second time, from the second estimate. In some embodiments, the fourth time is on a different time scale from the first, second and third times.

At a step or module 570, the dynamic model is altered in response to the value of the characteristic.

At a step or module 580, the state of the system is reported based on the fourth estimate.

At a step or module 590, a fifth estimate of the state of the system is constructed at a fifth time, after the second time, from the second estimate.

In certain embodiments, the dynamic model is a linear dynamic model with constant coefficients. In an embodiment, constructing the first estimate and constructing the second estimate are performed by a Kalman filter.

At a step or module 595, the state of the system is altered based on the fourth estimate.

The method 500 terminates at end step or module 598.

The impact of the processes introduced herein on the implementation of predictive analysis cannot be overstated. Whereas machine learning approaches are directly dependent on a large and fully populated training corpus, purely statistical approaches, such as ETS and the novel filter/predictor strategy introduced herein, learn directly from the real-time signal without additional data or knowledge imposed. Based upon the findings indicated in FIG. 1, the established ETS approach already performs better than the more widely used machine learning techniques. The improvements and advantages of the process introduced herein over ETS (shown in Table 2) only solidify the merits of the new approach.

In short, the novel filtering, smoothing, and predicting process does not require a priori knowledge, as machine learning techniques do. Because the system combines optimal estimation techniques of filtering, smoothing, and predicting, there are no dependencies on artificial neural nets and their (shallow, greedy, brittle, and opaque) shortcomings.

Turning now to FIG. 6 , illustrated is a block diagram of an embodiment of an apparatus 600 for estimating the state of a system in a machine learning environment. The apparatus 600 is configured to perform functions described hereinabove of constructing the estimate of the state of the system. The apparatus 600 includes a processor (or processing circuitry) 610, a memory 620 and a communication interface 630 such as a graphical user interface.

The functionality of the apparatus 600 may be provided by the processor 610 executing instructions stored on a computer-readable medium, such as the memory 620 shown in FIG. 6 . Alternative embodiments of the apparatus 600 may include additional components (such as the interfaces, devices and circuits) beyond those shown in FIG. 6 that may be responsible for providing certain aspects of the device's functionality, including any of the functionality to support the solution described herein.

The processor 610 (or processors), which may be implemented with one or a plurality of processing devices, performs functions associated with its operation including, without limitation, performing the operations of estimating the state of a system, computing covariance matrices, and estimating a future state of the system. The processor 610 may be of any type suitable to the local application environment, and may include one or more of general-purpose computers, special purpose computers, microprocessors, digital signal processors (“DSPs”), field-programmable gate arrays (“FPGAs”), application-specific integrated circuits (“ASICs”), and processors based on a multi-core processor architecture, as non-limiting examples.

The processor 610 may include, without limitation, application processing circuitry. In some embodiments, the application processing circuitry may be on separate chipsets. In alternative embodiments, part or all of the application processing circuitry may be combined into one chipset, and other application circuitry may be on a separate chipset. In still alternative embodiments, part or all of the application processing circuitry may be on the same chipset, and other application processing circuitry may be on a separate chipset. In yet other alternative embodiments, part or all of the application processing circuitry may be combined in the same chipset.

The memory 620 (or memories) may be one or more memories and of any type suitable to the local application environment, and may be implemented using any suitable volatile or nonvolatile data storage technology such as a semiconductor-based memory device, a magnetic memory device and system, an optical memory device and system, fixed memory and removable memory. The programs stored in the memory 620 may include program instructions or computer program code that, when executed by an associated processor, enable the respective apparatus 600 to perform its intended tasks. Of course, the memory 620 may form a data buffer for data transmitted to and from the same. Exemplary embodiments of the system, subsystems, and modules as described herein may be implemented, at least in part, by computer software executable by the processor 610, or by hardware, or by combinations thereof.

The communication interface 630 modulates information for transmission by the respective apparatus 600 to another apparatus. The respective communication interface 630 is also configured to receive information from another processor for further processing. The communication interface 630 can support duplex operation for the respective apparatus 600.

In summary, the inventions disclosed herein combine three techniques of optimal estimation of the state of a system. The three techniques include filtering, smoothing, and predicting processes, and can be performed, without limitation, in a machine learning and/or a noisy measurement environment.

The filtering portion of optimal estimation is performed to construct a first estimate of a state vector x_(k) at a time point t_(k) that coincides with a measurement of a value of a characteristic of the state of the system at the time point t_(k). The filtering process employs a covariance matrix that describes the accuracy of the first estimate of the state vector x_(k) at the time point t_(k). A second estimate of the state vector x_(k+1) is then constructed by propagating the state of the system forward to a second time point t_(k+1), the second time point being after the first time point. The propagating forward employs a dynamic model of the state of the system to produce the estimate of the state vector x_(k+1) at the second time point t_(k+1). Constructing the first estimate of the state vector x_(k) and constructing the second estimate of the state vector x_(k+1) can be performed by employing a Kalman filter.

The dynamic model can employ a matrix with coefficients that describes temporal evolution of the state of the system. In certain embodiments, the dynamic model is a linear dynamic model with constant coefficients.
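By way of illustration only, the propagation step with a constant-coefficient linear model can be sketched in a few lines of Python. The transition matrix F, the process-noise covariance Q, the two-element (position, velocity) state, the helper functions, and all numeric values are assumptions chosen for the example and are not taken from the embodiments.

```python
# Minimal sketch of a Kalman time update (propagation) in pure Python.
# F, Q, the constant-velocity state, and the numbers are illustrative assumptions.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]

def matadd(a, b):
    """Add two matrices of equal shape element by element."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def time_update(x, P, F, Q):
    """Propagate state x (column vector) and covariance P one step forward."""
    x_pred = matmul(F, x)                                   # x_(k+1) = F x_(k)
    P_pred = matadd(matmul(matmul(F, P), transpose(F)), Q)  # P_(k+1) = F P F^T + Q
    return x_pred, P_pred

dt = 1.0
F = [[1.0, dt], [0.0, 1.0]]      # constant coefficients: position += velocity * dt
Q = [[0.01, 0.0], [0.0, 0.01]]   # assumed process-noise covariance
x_k = [[2.0], [0.5]]             # assumed estimate at t_(k): position 2.0, velocity 0.5
P_k = [[1.0, 0.0], [0.0, 1.0]]   # assumed covariance at t_(k)

x_next, P_next = time_update(x_k, P_k, F, Q)
```

With these assumed values the propagated position is 2.5 and the velocity remains 0.5, while the position variance grows from 1.0 to 2.01, reflecting the added uncertainty of propagating without a new measurement.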

A value of a characteristic of the state of the system x_(k+1) is measured at the second time point t_(k+1). The second estimate of the state of the system and the second covariance matrix are adjusted based on the measured value of the characteristic at the second time point t_(k+1).
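The adjustment described above corresponds to what is commonly called the Kalman measurement update. The following Python sketch shows that update for a two-element state in which only the first element is measured. The measurement row H = [1, 0], the measurement variance R, and the numeric values are assumptions made for illustration, and the function name is hypothetical.

```python
# Sketch of a Kalman measurement update for a two-element state
# (position, velocity) when only position is measured, i.e. H = [1, 0].
# All numeric values are assumptions chosen for illustration.

def measurement_update(x, P, z, R):
    """Adjust estimate x and covariance P using scalar measurement z with variance R."""
    S = P[0][0] + R                   # innovation variance H P H^T + R
    K = [P[0][0] / S, P[1][0] / S]    # Kalman gain P H^T / S
    y = z - x[0]                      # innovation z - H x
    x_new = [x[0] + K[0] * y, x[1] + K[1] * y]
    # (I - K H) P: subtract K[i] times the first row of P from row i
    P_new = [[P[i][j] - K[i] * P[0][j] for j in range(2)] for i in range(2)]
    return x_new, P_new

x_prior = [2.5, 0.5]                  # assumed propagated estimate at t_(k+1)
P_prior = [[2.01, 1.0], [1.0, 1.01]]  # assumed propagated covariance
x_post, P_post = measurement_update(x_prior, P_prior, z=3.0, R=0.5)
```

The adjusted position estimate moves from the propagated value toward the measured value, and both diagonal entries of the covariance shrink, which is the expected effect of incorporating a measurement.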

Measuring the value of the characteristic can include making a plurality of independent measurements characterized by a diagonal measurement covariance matrix.
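A diagonal measurement covariance matrix means the measurement errors are mutually uncorrelated, so one common way to handle such measurements is to process them one at a time as scalar updates, which avoids a matrix inversion. The Python sketch below illustrates this for a scalar state measured directly; the function names and the numeric values are assumptions for the example, and the embodiments are not limited to this approach.

```python
# Sketch: sequential processing of independent measurements of a scalar state.
# Each measurement z has its own variance r (a diagonal entry of the
# measurement covariance). Values are illustrative assumptions.

def scalar_update(x, P, z, r):
    """One scalar Kalman measurement update with the state measured directly."""
    K = P / (P + r)                    # gain
    return x + K * (z - x), (1.0 - K) * P

def sequential_update(x, P, measurements, variances):
    """Apply the independent measurements one after another."""
    for z, r in zip(measurements, variances):
        x, P = scalar_update(x, P, z, r)
    return x, P

x0, P0 = 0.0, 4.0      # assumed prior estimate and variance
zs = [1.0, 1.4]        # two independent measurements
rs = [1.0, 2.0]        # their variances
x_post, P_post = sequential_update(x0, P0, zs, rs)

# Cross-check against the equivalent information (inverse-variance) form.
info = 1.0 / P0 + sum(1.0 / r for r in rs)
P_batch = 1.0 / info
x_batch = P_batch * (x0 / P0 + sum(z / r for z, r in zip(zs, rs)))
```

The sequential result agrees with the batch inverse-variance combination, which is why the order in which independent measurements are processed does not matter in this linear case.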

The smoothing portion of optimal estimation is performed by constructing a third state estimate for a time point that is earlier than the time point t_(k+1). The earlier time point can fall within or before a span of current measurement points, e.g., between or before the time points t_(k) and t_(k+1).
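One widely used fixed-interval smoother that sweeps backward over the filter results is the Rauch-Tung-Striebel form. Whether the embodiments use this particular form is an assumption; the Python sketch below applies it to a scalar state only to illustrate the backward recursion. The function names, model parameters, and measurement values are hypothetical.

```python
# Sketch: forward Kalman filter followed by a backward smoothing sweep
# (Rauch-Tung-Striebel form, assumed for illustration) on a scalar state.

def kalman_filter(zs, x0, P0, F, Q, H, R):
    """Forward pass; returns filtered (xf, Pf) and predicted (xp, Pp) sequences."""
    xf, Pf, xp, Pp = [], [], [], []
    x, P = x0, P0
    for z in zs:
        x, P = F * x, F * P * F + Q                   # time update
        xp.append(x)
        Pp.append(P)
        K = P * H / (H * P * H + R)                   # gain
        x, P = x + K * (z - H * x), (1.0 - K * H) * P # measurement update
        xf.append(x)
        Pf.append(P)
    return xf, Pf, xp, Pp

def rts_smooth(xf, Pf, xp, Pp, F):
    """Backward sweep from the last filtered estimate to the first."""
    xs, Ps = xf[:], Pf[:]
    for k in range(len(xf) - 2, -1, -1):
        C = Pf[k] * F / Pp[k + 1]                     # smoother gain
        xs[k] = xf[k] + C * (xs[k + 1] - xp[k + 1])
        Ps[k] = Pf[k] + C * (Ps[k + 1] - Pp[k + 1]) * C
    return xs, Ps

zs = [1.0, 1.2, 0.9, 1.1]                             # assumed measurements
xf, Pf, xp, Pp = kalman_filter(zs, x0=0.0, P0=1.0, F=1.0, Q=0.01, H=1.0, R=0.25)
xs, Ps = rts_smooth(xf, Pf, xp, Pp, F=1.0)
```

The smoothed variance at each earlier time point is no larger than the filtered variance at that point, because the smoother also uses the measurements made afterward; at the last time point the two coincide.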

The predicting portion then propagates the state estimate forward for a forecast period of interest. The last a posteriori state estimate and state covariance of the smoother can be used as the initial conditions for the prediction. The predictions may be at various time points in the future and over various time scales that are after the second time point. Measurement noise R_(k) can be set to an arbitrarily large value to accommodate the inherent absence of a state measurement at a future time point.
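The effect of a very large measurement noise can be illustrated with a short Python sketch on a scalar state. The gain applied to a measurement shrinks toward zero as the measurement noise grows, so the update leaves the propagated estimate essentially unchanged and the forecast reduces to repeated propagation. The initial values, model parameters, and function names below are assumptions for the example.

```python
# Sketch: forecasting by repeated propagation, and the near-zero gain that
# results from an arbitrarily large measurement noise. Scalar state;
# all values are illustrative assumptions.

def propagate(x, P, F, Q):
    """One time update of the estimate and its variance."""
    return F * x, F * P * F + Q

def forecast(x, P, F, Q, steps):
    """Propagate forward 'steps' times, collecting (estimate, variance) pairs."""
    out = []
    for _ in range(steps):
        x, P = propagate(x, P, F, Q)
        out.append((x, P))
    return out

def gain(P, H, R):
    """Kalman gain for a scalar state and scalar measurement."""
    return P * H / (H * P * H + R)

x_s, P_s = 1.05, 0.04            # assumed last smoothed estimate and variance
F, Q, H = 1.0, 0.01, 1.0         # assumed model parameters
predictions = forecast(x_s, P_s, F, Q, steps=5)
K_future = gain(predictions[0][1], H, R=1e12)   # effectively zero
```

In this sketch the forecast variance grows at every step because no measurement is available to reduce it, which is how the prediction covariance conveys decreasing confidence further into the forecast period.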

As described above, the exemplary embodiments provide both a method and corresponding apparatus consisting of various modules providing functionality for performing the steps of the method. The modules may be implemented as hardware (embodied in one or more chips including an integrated circuit such as an application specific integrated circuit), or may be implemented as software or firmware for execution by a processor. In particular, in the case of firmware or software, the exemplary embodiments can be provided as a computer program product including a computer readable storage medium embodying computer program code (i.e., software or firmware) thereon for execution by the computer processor. The computer readable storage medium may be non-transitory (e.g., magnetic disks; optical disks; read only memory; flash memory devices; phase-change memory) or transitory (e.g., electrical, optical, acoustical or other forms of propagated signals such as carrier waves, infrared signals, digital signals, etc.). The coupling of a processor and other components is typically through one or more busses or bridges (also termed bus controllers). The storage device and signals carrying digital traffic respectively represent one or more non-transitory or transitory computer readable storage media. Thus, the storage device of a given electronic device typically stores code and/or data for execution on the set of one or more processors of that electronic device such as a controller.

Although the embodiments and their advantages have been described in detail, it should be understood that various changes, substitutions, and alterations can be made herein without departing from the spirit and scope thereof as defined by the appended claims. For example, many of the features and functions discussed above can be implemented in software, hardware, or firmware, or a combination thereof. Also, many of the features, functions, and steps of operating the same may be reordered, omitted, added, etc., and still fall within the broad scope of the various embodiments.

Moreover, the scope of the various embodiments is not intended to be limited to the particular embodiments of the process, machine, manufacture, composition of matter, means, methods and steps described in the specification. As one of ordinary skill in the art will readily appreciate from the disclosure, processes, machines, manufacture, compositions of matter, means, methods, or steps, presently existing or later to be developed, that perform substantially the same function or achieve substantially the same result as the corresponding embodiments described herein may be utilized as well. Accordingly, the appended claims are intended to include within their scope such processes, machines, manufacture, compositions of matter, means, methods, or steps. 

1. A method, comprising: making a first measurement of a value of a characteristic of a state of a system; making a second measurement of a value of said characteristic of said state of said system after said first measurement; constructing a first filter measurement estimate after said second measurement coinciding with said first measurement including a first filter measurement covariance matrix describing an accuracy of said first filter measurement estimate; constructing a first filter time estimate after said first filter measurement estimate including a first filter time covariance matrix describing an accuracy of said first filter time estimate employing a dynamic model of said state of said system; constructing a second filter measurement estimate after said first filter time estimate coinciding with said second measurement including a second filter measurement covariance matrix describing an accuracy of said second filter measurement estimate; constructing a second filter time estimate after said second filter measurement estimate including a second filter time covariance matrix describing an accuracy of said second filter time estimate employing said dynamic model of said state of said system; constructing a smoothing estimate from said first filter measurement estimate and said second filter measurement estimate; and constructing a first prediction estimate after said smoothing estimate that provides a forecast of a value of said characteristic of said state of said system including a first prediction covariance matrix describing an accuracy of said first prediction estimate employing said dynamic model of said state of said system.
 2. The method as recited in claim 1 further comprising constructing a second prediction estimate after said first prediction estimate that provides another forecast of a value of said characteristic of said state of said system including a second prediction covariance matrix describing an accuracy of said second prediction estimate employing said dynamic model of said state of said system.
 3. The method as recited in claim 1 further comprising constructing a plurality of prediction estimates that provides a corresponding plurality of forecasts of a value of said characteristic of said state of said system including a corresponding plurality of prediction covariance matrices describing an accuracy of said plurality of prediction estimates employing said dynamic model of said state of said system.
 4. The method as recited in claim 1 wherein constructing said smoothing estimate comprises sweeping backward recursively from said second filter measurement estimate to said first filter measurement estimate.
 5. The method as recited in claim 1 further comprising altering said state of said system based on said first prediction estimate.
 6. The method as recited in claim 1 wherein said constructing said first filter measurement estimate, said first filter time estimate, said second filter measurement estimate and said second filter time estimate are performed by a Kalman filter.
 7. The method as recited in claim 1 further comprising reporting said state of said system based on said first prediction estimate.
 8. The method as recited in claim 1 wherein said first measurement comprises a plurality of independent measurements characterized by a diagonal measurement covariance matrix.
 9. The method as recited in claim 1 wherein said dynamic model comprises a linear dynamic model with constant coefficients.
 10. The method as recited in claim 1 wherein said dynamic model comprises a matrix with coefficients that describes a temporal evolution of said state of said system.
 11. An apparatus operable to construct a state of a system, comprising: processing circuitry coupled to a memory, configured to: make a first measurement of a value of a characteristic of said state of said system; make a second measurement of a value of said characteristic of said state of said system after said first measurement; construct a first filter measurement estimate after said second measurement coinciding with said first measurement including a first filter measurement covariance matrix describing an accuracy of said first filter measurement estimate; construct a first filter time estimate after said first filter measurement estimate including a first filter time covariance matrix describing an accuracy of said first filter time estimate employing a dynamic model of said state of said system; construct a second filter measurement estimate after said first filter time estimate coinciding with said second measurement including a second filter measurement covariance matrix describing an accuracy of said second filter measurement estimate; construct a second filter time estimate after said second filter measurement estimate including a second filter time covariance matrix describing an accuracy of said second filter time estimate employing said dynamic model of said state of said system; construct a smoothing estimate from said first filter measurement estimate and said second filter measurement estimate; and construct a first prediction estimate after said smoothing estimate that provides a forecast of a value of said characteristic of said state of said system including a first prediction covariance matrix describing an accuracy of said first prediction estimate employing said dynamic model of said state of said system.
 12. The apparatus as recited in claim 11 wherein said processing circuitry is further configured to construct a second prediction estimate after said first prediction estimate that provides another forecast of a value of said characteristic of said state of said system including a second prediction covariance matrix describing an accuracy of said second prediction estimate employing said dynamic model of said state of said system.
 13. The apparatus as recited in claim 11 wherein said processing circuitry is further configured to construct a plurality of prediction estimates that provides a corresponding plurality of forecasts of a value of said characteristic of said state of said system including a corresponding plurality of prediction covariance matrices describing an accuracy of said plurality of prediction estimates employing said dynamic model of said state of said system.
 14. The apparatus as recited in claim 11 wherein said processing circuitry is configured to construct said smoothing estimate by sweeping backward recursively from said second filter measurement estimate to said first filter measurement estimate.
 15. The apparatus as recited in claim 11 wherein said processing circuitry is further configured to alter said state of said system based on said first prediction estimate.
 16. The apparatus as recited in claim 11 wherein said processing circuitry is configured to construct said first filter measurement estimate, said first filter time estimate, said second filter measurement estimate and said second filter time estimate with a Kalman filter.
 17. The apparatus as recited in claim 11 wherein said processing circuitry is further configured to report said state of said system based on said first prediction estimate.
 18. The apparatus as recited in claim 11 wherein said first measurement comprises a plurality of independent measurements characterized by a diagonal measurement covariance matrix.
 19. The apparatus as recited in claim 11 wherein said dynamic model comprises a linear dynamic model with constant coefficients.
 20. The apparatus as recited in claim 11 wherein said dynamic model comprises a matrix with coefficients that describes a temporal evolution of said state of said system. 